Hierarchical Stress Modeling in Mandarin Text-to-Speech
Abstract
Automatic stress prediction is helpful for both speech synthesis and natural speech understanding. This paper proposes a novel hierarchical Mandarin stress modeling method. The top level emphasizes stressed syllables, while the bottom level focuses, for the first time, on unstressed syllables because of their importance to both the naturalness and the expressiveness of synthetic speech. A Maximum Entropy model is adopted to predict the stress structure from textual features. Experiments show that the modeling method successfully captures the macro- and micro-characteristics of stress. The F-scores of the two-level stress predictions are 73.3% and 78.7%, respectively, which are satisfactory compared to other prosody predictions.
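The abstract reports F-scores but does not say which variant was used. As a minimal sketch, assuming the balanced F-score (harmonic mean of precision and recall) over binary per-syllable stress labels, the metric could be computed as follows. The function name `f_score` and the toy label sequences are hypothetical and are not taken from the paper.

```python
def f_score(gold, predicted, positive=1):
    """Balanced F-score (F1) for one class, given parallel label lists.

    Assumption: labels are per-syllable, with `positive` marking the
    class of interest (e.g. 1 = stressed). This is the conventional
    definition, not necessarily the exact one used in the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = stressed syllable, 0 = not stressed.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
score = f_score(gold, predicted)
```

Under this reading, each level of the hierarchy would be scored separately against its own gold labels, which is consistent with the abstract reporting two figures.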
Related Resources
Hierarchical stress generation with Fujisaki model in expressive speech synthesis
This paper introduces a hierarchical stress generation method for expressive speech synthesis. In the previous study, we proposed a novel hierarchical Mandarin stress modeling method, and the text-based stress prediction experiments demonstrated that a reliable stress assignment can be obtained from textual features. However, the stress model should be further verified to be an effective and efficient pros...
Hierarchical stress modeling and generation in mandarin for expressive Text-to-Speech
Expressive speech synthesis has received increased attention in recent times. Stress (or pitch accent) is the perceptual prominence within words or utterances, which contributes to the expressivity of speech. This paper summarizes our contribution to Mandarin expressive speech synthesis. A novel hierarchical stress modeling and generation method for Mandarin is proposed and further integrated i...
Prosodic modeling in large vocabulary Mandarin speech recognition
The issue of incorporating prosodic information into speech recognition processes has emerged in recent years. In this work we present a complete framework for Mandarin speech recognition with prosodic modeling considering two-level hierarchical prosodic information for Mandarin Chinese. We developed a GMM-based, a decision-tree-based, and a hybrid approach. The best improvements in character r...
A New Approach of Speaking Rate Modeling for Mandarin Speech Prosody
A new approach to Mandarin speech prosody modeling that considers the effects of speaking rate is proposed. The approach is a modification of our previous prosody labeling and modeling method that takes speaking rate as a continuous independent variable and lets prosodic-acoustic features and some parameters of the prosodic models depend on it in order to account for its influence. A speaking rate-dependent hi...
A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese
This paper presents a set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese. A large speech corpus produced by a single speaker is used, and the speech output is synthesized from waveform units of variable lengths, with desired linguistic properties, retrieved from this corpus. Detailed methodologies were developed for designing “phonetically rich” and “prosodically ric...
Publication date: 2011